A NOVEL β1→6 N-ACETYLGLUCOSAMINYLTRANSFERASE,
ITS ACCEPTOR MOLECULE, LEUKOSIALIN, AND
A METHOD FOR CLONING PROTEINS HAVING ENZYMATIC ACTIVITY

This work was supported by grants CA33000 and
CA33895 awarded by the National Cancer Institute. The
United States Government has certain rights in this
invention.

BACKGROUND OF THE INVENTION 

FIELD OF THE INVENTION 

This invention relates generally to the fields of
biochemistry and molecular biology and more specifically to
a novel human enzyme, UDP-GlcNAc:Galβ1→3GalNAc (GlcNAc to
GalNAc) β1→6 N-acetylglucosaminyltransferase (core 2
N-acetylglucosaminyltransferase; C2GnT), and to a novel
acceptor molecule, leukosialin, CD43, for core 2 β1→6
N-acetylglucosaminyltransferase action. The invention
additionally relates to DNA sequences encoding core 2 β1→6
N-acetylglucosaminyltransferase and leukosialin, to vectors
containing a C2GnT DNA sequence or a leukosialin DNA
sequence, to recombinant host cells transformed with such
vectors and to a method of transient expression cloning in
CHO cells for identifying and isolating DNA sequences
encoding specific proteins, using CHO cells expressing a
suitable acceptor molecule.

BACKGROUND INFORMATION

Most O-glycosidic oligosaccharides in mammalian
glycoproteins are linked via N-acetylgalactosamine to the
hydroxyl groups of serine or threonine. These O-glycans
can be classified into 4 different groups depending on the
nature of the core portion of the oligosaccharides (see
Fig. 1). Although less well studied than N-glycans,
O-glycans likely have important biological functions.
Indeed, the presence of O-linked oligosaccharides with the
core 2 branch, Galβ1→3(GlcNAcβ1→6)GalNAc, has been
demonstrated in many biological processes.



Piller et al., J. Biol. Chem. 263:15146-15150
(1988) reported that human T-cell activation is associated
with the conversion of core 1-based tetrasaccharides to
core 2-based hexasaccharides on leukosialin, a major
sialoglycoprotein present on human T lymphocytes (see also
Fig. 1). A similar increase in hexasaccharides was
observed in peripheral blood lymphocytes of patients
suffering from T-cell leukemias (Saitoh et al., Blood
77:1491-1499 (1991)), myelogenous leukemias (Brockhausen et
al., Cancer Res. 51:1257-1263 (1991)) and immunodeficiency
due to AIDS and the Wiskott-Aldrich syndrome (Piller et
al., J. Exp. Med. 173:1501-1510 (1991)). In these
patients' lymphocytes, changes in the amount of
hexasaccharides were caused by increased activity of either
UDP-GlcNAc:Galβ1→3GalNAc (GlcNAc to GalNAc) β1→6
N-acetylglucosaminyltransferase (EC 2.4.1.102) or core 2 β1→6
N-acetylglucosaminyltransferase (Williams et al., J. Biol.
Chem. 255:11253-11261 (1980)). Increased activity of core
2 β1→6 N-acetylglucosaminyltransferase also was observed in
metastatic murine tumor cell lines as compared to their
parental, non-metastatic counterparts (Yousefi et al., J.
Biol. Chem. 266:1772-1782 (1991)).

Increased complexity of the attached
oligosaccharides increases the molecular weight of the
glycoprotein. For example, leukosialin containing
hexasaccharides has a molecular weight of ~135 kDa, whereas
leukosialin containing tetrasaccharides has a molecular
weight of ~105 kDa (Carlsson et al., J. Biol. Chem.
261:12779-12786 and 12787-12795 (1986)).

Fox et al., J. Immunol. 131:762-767 (1983) raised
a monoclonal antibody, T305, against human T-lymphocytic
leukemia cells. Sportsman et al., J. Immunol. 135:158-164
(1985) reported T305 binding was abolished by neuraminidase
treatment, suggesting T305 binds to hexasaccharides. T305
specifically reacts with the high molecular weight form of
leukosialin (Saitoh et al., supra, (1991)).

Previous studies indicated poly-N-acetyllactosamine
repeats extend almost exclusively from
the branch formed by the core 2 β1→6
N-acetylglucosaminyltransferase (Fukuda et al., J. Biol.
Chem. 261:12796-12806 (1986)). Consistent with these
results, Yousefi et al., supra, (1991) demonstrated that
the core 2 enzyme in metastatic tumor cells regulates the
level of poly-N-acetyllactosamine synthesis in O-linked
oligosaccharides.

Poly-N-acetyllactosamines are subject to a
variety of modifications, including the formation of the
sialyl Leˣ, NeuNAcα2→3Galβ1→4(Fucα1→3)GlcNAc-, or the sialyl
Leᵃ, NeuNAcα2→3Galβ1→3(Fucα1→4)GlcNAc-, determinants
(Fukuda, Biochim. Biophys. Acta 780:119-150 (1985)). Such
modifications are significant because these determinants,
which are present on neutrophils and monocytes, serve as
ligands for E- and P-selectin present on endothelial cells
and platelets, respectively (see, for example, Larsen et
al., Cell 63:467-474 (1990)).

In addition, tumor cells often express a
significant amount of sialyl Leˣ and/or sialyl Leᵃ on their
cell surfaces. The interaction between E-selectin or
P-selectin and these cell surface carbohydrates may play a
role in tumor cell adhesion to endothelium during the
metastatic process (Walz et al., supra, (1990)). Kojima et
al., Biochem. Biophys. Res. Commun. 182:1288-1295 (1992)
reported that selectin-dependent tumor cell adhesion to
endothelial cells was abolished by blocking O-glycan
synthesis. Complex sulfated O-glycans also may serve as
ligands for the lymphocyte homing receptor, L-selectin
(Imai et al., J. Cell Biol. 113:1213-1221 (1991)).

These reported observations establish core 2 β1→6
N-acetylglucosaminyltransferase as a critical enzyme in
O-glycan biosynthesis. The availability of core 2 β1→6
N-acetylglucosaminyltransferase will allow the in vivo and in
vitro production of specific glycoproteins having core 2
oligosaccharides and subsequent study of these variant
O-glycans on cell-cell interactions. For example, core 2
β1→6 N-acetylglucosaminyltransferase is a useful marker for
transformed or cancerous cells. An understanding of the
role of core 2 β1→6 N-acetylglucosaminyltransferase in
transformed and cancerous cells may elucidate a mechanism
for the aberrant cell-cell interactions observed in these
cells. In order to understand the control of expression of
these oligosaccharides and their function, isolation of a
cDNA clone for core 2 β1→6 N-acetylglucosaminyltransferase
is a prerequisite. However, the DNA sequence encoding core
2 β1→6 N-acetylglucosaminyltransferase has not yet been
reported.

Thus, a need exists for identifying the core 2
β1→6 N-acetylglucosaminyltransferase and the DNA sequences
encoding this enzyme. The present invention satisfies this
need and provides related advantages as well.

SUMMARY OF THE INVENTION 

The present invention generally relates to a
novel purified human β1→6 N-acetylglucosaminyltransferase.
A cDNA sequence encoding a 428 amino acid protein having
β1→6 N-acetylglucosaminyltransferase activity also is
provided. The purified human β1→6
N-acetylglucosaminyltransferase, or an active fragment
thereof, catalyzes the formation of critical branches in
O-glycans.

The invention further relates to a novel purified
acceptor molecule, leukosialin, CD43, for core 2 β1→6
N-acetylglucosaminyltransferase activity. The leukosialin
cDNA encodes a novel variant leukosialin, which is created
by alternative splicing of the genomic leukosialin DNA
sequence.



Isolated nucleic acids encoding either core 2
β1→6 N-acetylglucosaminyltransferase or leukosialin are
disclosed, as are vectors containing the nucleic acids and
recombinant host cells transformed with such vectors. The
invention further provides methods of detecting such
nucleic acids by contacting a sample with a nucleic acid
probe having a nucleotide sequence capable of hybridizing
with the isolated nucleic acids of the present invention.
The core 2 β1→6 N-acetylglucosaminyltransferase and
leukosialin amino acid and nucleic acid sequences disclosed
herein can be purified from human cells or produced using
well known methods of recombinant DNA technology.

The invention also discloses a method of
isolating nucleic acid sequences encoding proteins that
have an enzymatic activity. Such a nucleic acid sequence
is obtained by transfecting the nucleic acid, which is
contained within a vector having a polyoma virus
replication origin, into a Chinese hamster ovary (CHO) cell
line simultaneously expressing polyoma virus large T
antigen and the acceptor molecule for the protein having an
enzymatic activity.

BRIEF DESCRIPTION OF THE DRAWINGS 

Figure 1 depicts the structures and biosynthesis
of O-glycans. Structures of O-glycan cores can be
classified into 4 groups (core 1 to core 4), each of which
is synthesized starting with GalNAcα1→Ser/Thr. The core 1
structure is synthesized by the addition of a β1→3 Gal
residue to the GalNAc residue. The core 1 structure can be
converted to core 2 by the addition of a β1→6
N-acetylglucosaminyl residue. This intermediate is usually
converted to the hexasaccharide by sequential addition of
galactose and sialic acid residues (bottom right). The
core 2 β1→6 N-acetylglucosaminyltransferase and the linkage
formed by the enzyme are indicated by a box. In certain
cell types, the core 2 structure can be extended by the
addition of N-acetyllactosamine (Galβ1→4GlcNAcβ1→3) repeats
to form poly-N-acetyllactosamine. In the absence of core
2 β1→6 N-acetylglucosaminyltransferase, core 1 is converted
to the monosialoform, then to the disialoform by sequential
addition of α2→3- and α2→6-linked sialic acid residues
(bottom left). Alternatively, core 3 can be synthesized by
the addition of a β1→3 N-acetylglucosaminyl residue to the
GalNAc residue. Core 3 can be converted to core 4 by
another β1→6 N-acetylglucosaminyltransferase (top of
figure).

Figure 2 depicts the genomic DNA sequence (SEQ. ID.
NO. 1) and cDNA sequence (SEQ. ID. NO. 2) of leukosialin.
The genomic sequence is numbered relative to the
transcriptional start site. Exon 1 and exon 2 have been
previously described. Exon 1' is newly identified here.
In the isolated cDNA, exon 1' is immediately followed by
the exon 2 sequence. Deduced amino acids are presented
under the coding sequence, which begins in exon 2 (SEQ. ID.
NO. 3). A portion of the exon 2 sequence is shown.

Figure 3 establishes the ability of pGT/hCG to
replicate in CHO cell lines expressing polyoma large T
antigen and leukosialin. In panel A, six clonal CHO cell
lines were examined for replication of pcDNAI-based pGT/hCG
(lanes 1-6). In panel B, replication of cell clone 5
(CHO-Py-leu) was further examined by treatment with increasing
concentrations of DpnI and XhoI (lanes 2 and 3). Plasmid
DNA isolated from MOP-8 cells was used as a control (lane
1). Plasmid DNA was extracted using the Hirt procedure and
samples were digested with XhoI and DpnI. In parallel,
pGT/hCG plasmid purified from E. coli MC1061/P3 was
digested with XhoI and DpnI (lane 7 in panel A and lane 4
in panel B) or XhoI alone (lane 8 in panel A and lane 5 in
panel B). The arrow indicates the migration of plasmid DNA
resistant to DpnI digestion. The arrowheads indicate
plasmid DNA digested by DpnI.

Figure 4 shows the expression of T305 antigen
directed by pcDNAI-C2GnT. Subconfluent CHO-Py-leu cells
were transfected with pcDNAI-C2GnT (panels A and B) or
mock-transfected with pcDNAI (panels C and D). Sixty-four
hours after transfection, the cells were fixed, then
incubated with mouse T305 monoclonal antibody followed by
fluorescein isocyanate-conjugated sheep anti-mouse IgG
(panels A, B and C). Two different areas are shown in
panels A and B. Panel D shows a phase micrograph of the
same field shown in panel C. Bar = 20 µm.

Figure 5 depicts the cDNA sequence (SEQ. ID. NO.
4) and translated amino acid sequence (SEQ. ID. NO. 5) of
core 2 β1→6 N-acetylglucosaminyltransferase. The open
reading frame and full-length nucleotide sequence of C2GnT
are shown. The signal/membrane-anchoring domain is doubly
underlined. The polyadenylation signal is boxed.
Potential N-glycosylation sites are marked with asterisks.
The sequences are numbered relative to the translation
start site.

Figure 6 shows the expression of core 2 β1→6
N-acetylglucosaminyltransferase mRNA in various cell types.
Poly(A)+ RNA (11 µg) from CHO-Py-leu cells (lane 1), HL-60
promyelocytes (lane 2), K562 erythrocytic cells (lane 3),
and SP and L4 colonic carcinoma cells (lanes 4 and 5) was
resolved by electrophoresis. RNA was transferred to a
nylon membrane and hybridized with a radiolabeled fragment
of pPROTA-C2GnT. Migration of RNA size markers is
indicated.



Figure 7 illustrates the construction of the
vector encoding the protein A-C2GnT fusion protein. The
cDNA sequence corresponding to Pro38 to His428 was fused in
frame with the IgG binding domain of S. aureus protein A
(bottom; SEQ. ID. NO. 6). The sequence includes the
cleavable signal peptide, which allows secretion of the
fused protein. The coding sequence is under control of the
SV40 promoter. The remainder of the vector sequence shown
was derived from rabbit β-globin gene sequences, including
an intervening sequence (IVS) and a polyadenylation signal
(An).



DETAILED DESCRIPTION OF THE INVENTION 

The present invention generally relates to a
novel human core 2 β1→6 N-acetylglucosaminyltransferase.
The invention further relates to a novel method of
transient expression cloning in CHO cells that was used to
isolate the cDNA sequence encoding human core 2 β1→6
N-acetylglucosaminyltransferase (C2GnT). The invention also
relates to a novel human leukosialin, which is an acceptor
molecule for core 2 β1→6 N-acetylglucosaminyltransferase
activity.

Cells generally contain extremely low amounts of
glycosyltransferases. As a result, cDNA cloning based on
screening using an antibody or a probe based on the
glycosyltransferase amino acid sequence has met with
limited success. However, isolation of cDNAs encoding
various glycosyltransferases can be achieved by transient
expression of cDNA in recipient cells.

Successful application of the transient
expression cloning method to isolate a cDNA sequence
encoding a glycosyltransferase requires an appropriate
recipient cell line. Ideal recipient cells should not
express the glycosyltransferase of interest. As a result,
the recipient cells would normally lack the oligosaccharide
structure formed by such a glycosyltransferase.

Expression of the cloned glycosyltransferase cDNA
in the recipient cell line should result in formation of
the specific oligosaccharide structure. The resultant
oligosaccharide can be identified using a specific antibody
or lectin that recognizes the structure. The recipient
cell line also must support replication of an appropriate
plasmid vector.

COS-1 cells initially appear to satisfy the
requirements for using the transient expression method.
COS-1 cells express SV40 large T antigen and support the
replication of plasmid vectors harboring an SV40 replication
origin (Gluzman et al., Cell 23:175-182 (1981)). Although
COS-1 cells, themselves, express a variety of
glycosyltransferases, COS-1 cells have been used to clone
cDNA sequences encoding human blood group Lewis α1→3/4
fucosyltransferase and murine α1→3 galactosyltransferase
(Kukowska-Latallo et al., Genes and Devel. 4:1288-1303
(1990); Larsen et al., Proc. Natl. Acad. Sci. USA
86:8227-8231 (1989)). Also, Goelz et al., Cell 63:175-182 (1990),
utilized an antibody that inhibits E-selectin mediated
adhesion to isolate a cDNA sequence encoding α1→3
fucosyltransferase.

An attempt was made to use COS-1 cells to isolate
cDNA clones encoding core 2 β1→6
N-acetylglucosaminyltransferase. COS-1 cells were
transfected using cDNA obtained from activated human T
cells, which express the core 2 β1→6
N-acetylglucosaminyltransferase. Transfected cells suspected
of expressing core 2 β1→6 N-acetylglucosaminyltransferase
were identified by the presence of
increased levels of the core 2 oligosaccharide structure
formed by core 2 β1→6 N-acetylglucosaminyltransferase
activity. The presence of the core 2 structure was
identified using the monoclonal antibody, T305, which
identifies a hexasaccharide on leukosialin. A clone
expressing high levels of the T305 antigen was isolated and
sequenced.

Surprisingly, transfection using COS-1 cells
resulted in the isolation of a cDNA clone encoding a novel
variant of human leukosialin, which is the acceptor
molecule for core 2 β1→6 N-acetylglucosaminyltransferase
activity. Examination of the cDNA sequence of the newly
isolated leukosialin revealed the cDNA sequence was formed
as a result of alternative splicing of exons in the genomic
leukosialin DNA sequence. Specifically, the newly isolated
leukosialin is encoded by a cDNA sequence containing a
previously undescribed non-coding exon at the 5'-terminus
(exon 1' in Figure 2; SEQ. ID. NO. 1 and SEQ. ID. NO. 2).

The unexpected result obtained using COS-1 cells
led to the development of a new transfection system to
isolate a cDNA sequence encoding core 2 β1→6
N-acetylglucosaminyltransferase. CHO cells, which do not
normally express the T305 antigen, were transfected with
DNA sequences encoding human leukosialin and the polyoma
virus large T antigen. A cell line, designated CHO-Py-leu,
which expresses human leukosialin and polyoma virus large
T antigen, was isolated.

CHO-Py-leu cells were used for transient
expression cloning of a cDNA sequence encoding core 2 β1→6
N-acetylglucosaminyltransferase. CHO-Py-leu cells were
transfected with cDNA obtained from human HL-60
promyelocytes. A plasmid, pcDNAI-C2GnT, which directed
expression of the T305 antigen, was isolated and the cDNA
insert was sequenced (see Figure 5; SEQ. ID. NO. 4). The
2105 base pair cDNA sequence encodes a putative 428 amino
acid protein. The genomic DNA sequence encoding the enzyme
can be isolated using methods well known to those skilled in the
art, such as nucleic acid hybridization using the core 2
β1→6 N-acetylglucosaminyltransferase cDNA disclosed herein
to screen, for example, a genomic library prepared from
HL-60 promyelocytes.

An enzyme similar to the disclosed human core 2
β1→6 N-acetylglucosaminyltransferase has been purified from
bovine tracheal epithelium (Ropp et al., J. Biol. Chem.
266:23863-23871 (1991), which is incorporated herein by
reference). The apparent molecular weight of the bovine
enzyme is ~69 kDa. In comparison, the predicted molecular
weight of the polypeptide portion of core 2 β1→6
N-acetylglucosaminyltransferase is ~50 kDa. The deduced amino
acid sequence of core 2 β1→6
N-acetylglucosaminyltransferase reveals two to three
potential N-glycosylation sites, suggesting N-glycosylation
and O-glycosylation, or other post-translational
modification, could account for the larger apparent size of
the bovine enzyme.
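
The size comparison above can be illustrated with a rough
sketch. It estimates polypeptide mass from residue count and
scans a sequence for the N-glycosylation sequon Asn-X-Ser/Thr
(X not Pro). The average residue mass of about 110 Da is a
common rule of thumb assumed here, and the example peptide is
hypothetical; it is not taken from SEQ. ID. NO. 5.

```python
import re

# Rule-of-thumb average mass per amino acid residue (assumption, not measured).
AVG_RESIDUE_MASS_DA = 110.0


def estimate_mass_kda(n_residues: int) -> float:
    """Approximate unmodified polypeptide mass in kDa from residue count."""
    return n_residues * AVG_RESIDUE_MASS_DA / 1000.0


def find_n_glyc_sequons(protein: str) -> list[int]:
    """Return 1-based positions of Asn in N-X-S/T sequons (X is not Pro)."""
    # The lookahead keeps matches zero-width after N, so overlapping
    # sequons are all reported.
    return [m.start() + 1 for m in re.finditer(r"N(?=[^P][ST])", protein.upper())]


if __name__ == "__main__":
    # 428 residues is the length of the open reading frame stated in the text.
    print(f"estimated mass: {estimate_mass_kda(428):.1f} kDa")
    toy_peptide = "MKNGSAPNPTLLNWTQ"  # hypothetical sequence for illustration
    print("sequon positions:", find_n_glyc_sequons(toy_peptide))
```

Under these assumptions a 428-residue polypeptide comes out
near 47 kDa, in line with the ~50 kDa prediction, so any
difference from the ~69 kDa bovine enzyme would have to come
from glycosylation or other modification.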

Expression of the cloned C2GnT sequence, or a
fragment thereof, directed formation of the specific
O-glycan core 2 oligosaccharide structure. Although several
cDNA sequences encoding glycosyltransferases have been
isolated (Paulson and Colley, J. Biol. Chem.
264:17615-17618 (1989); Schachter, Curr. Opin. Struct. Biol.
1:755-765 (1991), which are incorporated herein by reference),
C2GnT is the first reported cDNA sequence encoding an
enzyme involved exclusively in O-glycan synthesis.

In O-glycans, β1→6 N-acetylglucosaminyl linkages
may occur in both core 2, Galβ1→3(GlcNAcβ1→6)GalNAc, and
core 4, GlcNAcβ1→3(GlcNAcβ1→6)GalNAc, structures
(Brockhausen et al., Biochemistry 24:1866-1874 (1985),
which is incorporated herein by reference). In addition,
β1→6 N-acetylglucosaminyl linkages occur in the side chains
of poly-N-acetyllactosamine, forming the I-structure
(Piller et al., J. Biol. Chem. 259:13385-13390 (1984),
which is incorporated herein by reference), and in the side
chain attached to α-mannose of the N-glycan core structure,
forming a tetraantennary saccharide (Cummings et al., J.
Biol. Chem. 257:13421-13427 (1982), which is incorporated
herein by reference). The enzymes responsible for these
linkages all share the unique property that Mn²⁺ is not
required for their activity.

Although it was originally suggested that these
β1→6 N-acetylglucosaminyl linkages were formed by the same
enzyme (Piller et al., 1984), the present disclosure
clearly demonstrates that the HL-60-derived core 2 β1→6
N-acetylglucosaminyltransferase is specific for the formation
only of O-glycan core 2. This result is consistent with a
recent report demonstrating that myeloid cell lysates
contain the enzymatic activity associated with core 2, but
not core 4, formation (Brockhausen et al., supra, (1991)).

Analysis of mRNA isolated from colonic cancer
cells indicated core 2 β1→6 N-acetylglucosaminyltransferase
is expressed in these cells. Recent studies using affinity
absorption suggested at least two different β1→6
N-acetylglucosaminyltransferases were present in tracheal
epithelium (Ropp et al., supra, (1991)). One of these
transferases formed core 2, core 4, and I structures.
Thus, at least one other β1→6
N-acetylglucosaminyltransferase present in epithelial cells
can form core 2, core 4 and I structures. Similarly, a
β1→6 N-acetylglucosaminyltransferase present in Novikoff
hepatoma cells can form both core 2 and I structures
(Koenderman et al., Eur. J. Biochem. 166:199-208 (1987),
which is incorporated herein by reference).

The acceptor molecule specificity of core 2 β1→6
N-acetylglucosaminyltransferase is different from the
specificity of the enzymes present in tracheal epithelium
and Novikoff hepatoma cells. Thus, a family of β1→6
N-acetylglucosaminyltransferases can exist, the members of
which differ in acceptor specificity but are capable of
forming the same linkage. Members of this family are
isolated from cells expressing β1→6
N-acetylglucosaminyltransferase activity, using, for example,
nucleic acid hybridization assays and studies of acceptor
molecule specificity. Such a family was reported for the
α1→3 fucosyltransferases (Weston et al., J. Biol. Chem.
267:4152-4160 (1992), which is incorporated herein by
reference).

The formation of the core 2 structure is critical
to cell structure and function. For example, the core 2
structure is essential for elongation of
poly-N-acetyllactosamine and for formation of sialyl Leˣ or sialyl
Leᵃ structures. Furthermore, the biosynthesis of cartilage
keratan sulfate may be initiated by the core 2 β1→6
N-acetylglucosaminyltransferase, since the keratan sulfate
chain is extended from a branch present in the core 2 structure
in the same way as poly-N-acetyllactosamine (Dickenson et
al., Biochem. J. 269:55-59 (1990), which is incorporated
herein by reference). Keratan sulfate is absent in
wild-type CHO cells, which do not express the core 2 β1→6
N-acetylglucosaminyltransferase (Esko et al., J. Biol. Chem.
261:15725-15733 (1986), which is incorporated herein by
reference). These structures are believed to be important
for cellular recognition and matrix formation. The
availability of the cDNA clone encoding the core 2 β1→6
N-acetylglucosaminyltransferase will aid in understanding how
the various carbohydrate structures are formed during
differentiation and malignancy. Manipulation of the
expression of the various carbohydrate structures by gene
transfer and gene inactivation methods will help elucidate
the various functions of these structures.

The present invention is directed to a method for
transient expression cloning in CHO cells of cDNA sequences
encoding proteins having enzymatic activity. Isolation of
human core 2 β1→6 N-acetylglucosaminyltransferase is
provided as an example of the disclosed method. However,
the method can be used to obtain cDNA sequences encoding
other proteins having enzymatic activity.



For example, lectins and antibodies reactive with
other specific oligosaccharide structures are available and
can be used to screen for glycosyltransferase activity.
Also, CHO cell lines that have defects in glycosylation
have been isolated. These cell lines can be used to study
the activity of the corresponding glycosyltransferase
(Stanley, Ann. Rev. Genet. 18:525-552 (1984), which is
incorporated herein by reference). CHO cell lines also
have been selected for various defects in cellular
metabolism, loss of expression of cell surface molecules
and resistance to cytotoxic drugs (see, for example,
Malmstrom and Krieger, J. Biol. Chem. 266:24025-24030
(1991); Yayon et al., Cell 64:841-848 (1991), which are
incorporated herein by reference). The approach disclosed
herein should allow isolation of cDNA sequences encoding
the proteins involved in these various cellular functions.

As used herein, the terms "purified" and
"isolated" mean that the molecule or compound is
substantially free of contaminants normally associated with
a native or natural environment. For example, a purified
protein can be obtained by a number of methods. The
naturally-occurring protein can be purified by any means
known in the art, including, for example, by affinity
purification with antibodies having specific reactivity
with the protein. In this regard, anti-core 2 β1→6
N-acetylglucosaminyltransferase antibodies can be used to
substantially purify naturally-occurring core 2 β1→6
N-acetylglucosaminyltransferase from human HL-60
promyelocytes.

Alternatively, a purified protein of the present
invention can be obtained by well known recombinant
methods, utilizing the nucleic acids disclosed herein, as
described, for example, in Sambrook et al., Molecular
Cloning: A Laboratory Manual 2d ed. (Cold Spring Harbor
Laboratory 1989), which is incorporated herein by
reference, and by the methods described in the Examples
below. Furthermore, purified proteins can be synthesized
by methods well known in the art.

As used herein, the phrase "substantially the
sequence" includes the described nucleotide or amino acid
sequence and sequences having one or more additions,
deletions or substitutions that do not substantially affect
the ability of the sequence to encode a protein having a
desired functional activity. In addition, the phrase
encompasses any additional sequence that hybridizes to the
disclosed sequence under stringent hybridization conditions.
Methods of hybridization are well known to those skilled in
the art. For example, sequence modifications that do not
substantially alter such activity are intended. Thus, a
protein having substantially the amino acid sequence of
Figure 5 (SEQ. ID. NO. 5) refers to core 2 β1→6
N-acetylglucosaminyltransferase encoded by the cDNA described
in Example IV, as well as proteins having amino acid
sequences that are modified but, nevertheless, retain the
functions of core 2 β1→6 N-acetylglucosaminyltransferase.
One skilled in the art can readily determine such retention
of function following the guidance set forth, for example,
in Examples V and VI.

The present invention is further directed to
active fragments of the human core 2 β1→6
N-acetylglucosaminyltransferase protein. As used herein, an
active fragment refers to portions of the protein that
substantially retain the glycosyltransferase activity of
the intact core 2 β1→6 N-acetylglucosaminyltransferase
protein. One skilled in the art can readily identify
active fragments of proteins such as core 2 β1→6
N-acetylglucosaminyltransferase by comparing the activities
of a selected fragment with the intact protein following
the guidance set forth in the Examples below.

As used herein, the term "glycosyltransferase
activity" refers to the function of a glycosyltransferase
to link sugar residues together through a glycosidic bond
to create critical branches in oligosaccharides.
Glycosyltransferase activity results in the specific
transfer of a monosaccharide to an appropriate acceptor
molecule, such that the acceptor molecule contains
oligosaccharides having critical branches. One skilled in
the art would understand the terms "enzymatic activity" and
"catalytic activity" to generally refer to a function of
certain proteins, such as the function of those proteins
having glycosyltransferase activity.

As used herein, the term "acceptor molecule"
refers to a molecule that is acted upon by a protein having
enzymatic activity. For example, an acceptor molecule,
such as leukosialin, as identified by the amino acid
sequence of Figure 2 (SEQ. ID. NO. 3), accepts the transfer
of a monosaccharide due to glycosyltransferase activity.
An acceptor molecule, such as leukosialin, may already
contain one or more sugar residues. The transfer of
monosaccharides to an acceptor molecule, such as
leukosialin, results in the formation of critical branches
of oligosaccharides.

As used herein, the term "critical branches"
refers to oligosaccharide structures formed by specific
glycosyltransferase activity. Critical branches may be
involved in various cellular functions, such as cell-cell
recognition. The oligosaccharide structure of a critical
branch can be determined using methods well known in the
art, such as the method for determining the core 2
oligosaccharide structure, as described in Examples V and
VI.

Relatedly, the invention also provides nucleic
acids encoding the human core 2 β1→6
N-acetylglucosaminyltransferase protein and leukosialin
protein described above. The nucleic acids can be in the
form of DNA, RNA or cDNA, such as the novel C2GnT cDNA of
2105 base pairs identified in Figure 5 (SEQ. ID. NO. 4) or
the novel leukosialin cDNA identified in Figure 2 (SEQ. ID.
NO. 2), for example. Such nucleic acids can also be
chemically synthesized by methods known in the art,
including, for example, the use of an automated nucleic
acid synthesizer.

The nucleic acid can have substantially the
nucleotide sequence of C2GnT, identified in Figure 5 (SEQ.
ID. NO. 4), or leukosialin identified in Figure 2 (SEQ. ID.
NO. 2). Portions of such nucleic acids that encode active
fragments of the core 2 β1→6
N-acetylglucosaminyltransferase protein or leukosialin
protein of the present invention also are contemplated.

Nucleic acid probes capable of hybridizing to the
nucleic acids of the present invention under reasonably
stringent conditions can be prepared from the cloned
sequences or by synthesizing oligonucleotides by methods
known in the art. The probes can be labeled with markers
according to methods known in the art and used to detect
the nucleic acids of the present invention. Detection of
such nucleic acids can be accomplished by
contacting the probe with a sample containing or suspected
of containing the nucleic acid under hybridizing
conditions, and detecting the hybridization of the probe to
the nucleic acid.

The present invention is further directed to
vectors containing the nucleic acids described above. The
term "vector" includes vectors that are capable of
expressing nucleic acid sequences operably linked to
regulatory sequences capable of effecting their expression.
Numerous cloning vectors are known in the art. Thus, the
selection of an appropriate cloning vector is a matter of
choice. In general, useful vectors for recombinant DNA are
often plasmids, which are circular double-stranded DNA
loops, such as pcDNAI or pcDSRα. As used herein, "plasmid"
and "vector" may be used interchangeably as the plasmid is
a common form of a vector. However, the invention is
intended to include other forms of expression vectors that
serve equivalent functions.

Suitable host cells containing the vectors of the
present invention are also provided. Host cells can be
transformed with a vector and used to express the desired
recombinant or fusion protein. Methods of recombinant
expression in a variety of host cells, such as mammalian,
yeast, insect or bacterial cells, are widely known. For
example, a nucleic acid encoding core 2 β1→6
N-acetylglucosaminyltransferase or a nucleic acid encoding
leukosialin can be transfected into cells using the calcium
phosphate technique or other transfection methods, such as
those described in Sambrook et al., supra, (1989).

Alternatively, nucleic acids can be introduced
into cells by infection with a retrovirus carrying the gene
or genes of interest. For example, the gene can be cloned
into a plasmid containing retroviral long terminal repeat
sequences, the C2GnT DNA sequence or the leukosialin DNA
sequence, and an antibiotic resistance gene for selection.
The construct can then be transfected into a suitable cell
line, such as PA12, which carries a packaging deficient
provirus and expresses the necessary components for virus
production, including synthesis of amphotropic
glycoproteins. The supernatant from these cells contains
infectious virus, which can be used to infect the cells of
interest.

Isolated recombinant polypeptides or proteins can
be obtained by growing the described host cells under
conditions that favor transcription and translation of the
transfected nucleic acid. Recombinant proteins produced by
the transfected host cells are isolated using methods set
forth herein and by methods well known to those skilled in
the art.

Also provided are antibodies having specific
reactivity with the core 2 β1→6
N-acetylglucosaminyltransferase protein or leukosialin
protein of the present invention. Active fragments of
antibodies, for example, Fab and F(ab')2 fragments, having
specific reactivity with such proteins are intended to fall
within the definition of an "antibody." Antibodies
exhibiting a titer of at least about 1.5 x 10^5, as
determined by ELISA, are useful in the present invention.

The antibodies of the invention can be produced
by any method known in the art. For example, polyclonal
and monoclonal antibodies can be produced by methods
described in Harlow and Lane, Antibodies: A Laboratory
Manual (Cold Spring Harbor 1988), which is incorporated
herein by reference. The proteins, particularly core 2
β1→6 N-acetylglucosaminyltransferase or leukosialin of the
present invention, can be used as immunogens to generate
such antibodies. Altered antibodies, such as chimeric,
humanized, CDR-grafted or bifunctional antibodies, can also
be produced by methods well known to those skilled in the
art. Such antibodies can also be produced by hybridoma,
chemical synthesis or recombinant methods described, for
example, in Sambrook et al., supra, (1989).

The antibodies can be used for determining the
presence of, or for purifying, the core 2 β1→6
N-acetylglucosaminyltransferase protein or the leukosialin
protein of the present invention. With respect to the
detection of such proteins, the antibodies can be used in
in vitro or in vivo methods well known to those skilled in
the art.



Finally, kits useful for carrying out the methods
of the invention are also provided. The kits can contain
a core 2 β1→6 N-acetylglucosaminyltransferase protein,
antibody or nucleic acid of the present invention and an
ancillary reagent. Alternatively, the kit can contain a
leukosialin protein, antibody or nucleic acid of the
present invention and an ancillary reagent. An ancillary
reagent may include diagnostic agents, signal detection
systems, buffers, stabilizers, pharmaceutically acceptable
carriers or other reagents and materials conventionally
included in such kits.



A cDNA sequence encoding core 2 β1→6
N-acetylglucosaminyltransferase was isolated and core 2 β1→6
N-acetylglucosaminyltransferase activity was determined.
This is the first report of transient expression cloning
using CHO cells expressing polyoma large T antigen. The
following examples are intended to illustrate but not limit
the present invention.

EXAMPLE I

EXPRESSION CLONING IN COS-1 CELLS OF THE cDNA FOR THE
PROTEIN CARRYING THE HEXASACCHARIDES

COS-1 cells were transfected with a cDNA library,
pcDSRα-2Fl, constructed from poly(A)+ RNA of activated T
lymphocytes, which express the core 2 β1→6
N-acetylglucosaminyltransferase (Yokota et al., Proc. Natl.
Acad. Sci. USA 83:5894-5898 (1986); Piller et al., supra,
(1988), which are incorporated herein by reference). COS-1
cells support replication of the pcDSRα constructs, which
contain the SV40 replication origin. Transfected cells
were selected by panning using monoclonal antibody T305,
which recognizes sialylated branched hexasaccharides
(Piller et al., supra, (1991); Saitoh et al., supra,
(1991)). Methods referred to in this example are described
in greater detail in the examples that follow.

Following several rounds of transfection, one
plasmid, pcDSRα-leu, directing high expression of the T305
antigen was identified. The cloned cDNA insert was
isolated and sequenced, then compared with other reported
sequences. The newly isolated cDNA sequence was nearly
identical to the sequence reported for leukosialin, except
the 5'-flanking sequences were different (Pallant et al.,
Proc. Natl. Acad. Sci. USA 86:1328-1332 (1989), which is
incorporated herein by reference).



Comparison of the cloned cDNA sequence with the
genomic leukosialin DNA sequence revealed that the start site of
the cDNA sequence is located 259 bp upstream of the
transcription start site of the previously reported
sequence (Figure 2; compare Exon 1' and Exon 1) (Shelley et
al., Biochem. J. 270:569-576 (1990); Kudo and Fukuda, J.
Biol. Chem. 266:8483-8489 (1991), which are incorporated
herein by reference). A consensus splice site was
identified at the exon-intron junction of the newly
identified 122 bp exon 1' in pcDSRα-leu (Breathnach and
Chambon, Ann. Rev. Biochem. 50:349-383 (1981), which is
incorporated herein by reference). This splice site is
followed by the exon 2 sequence.



These results indicate the T305 antibody
preferentially binds to branched hexasaccharides attached
to leukosialin. Indeed, a small amount of the
hexasaccharides (approximately 8% of the total) was
detected in O-glycans isolated from control COS-1 cells.
T305 binding is similar to that of anti-M and anti-N antibodies,
which recognize both the glycan and polypeptide portions of
the erythrocyte glycoprotein, glycophorin (Sadler et al., J.
Biol. Chem. 254:2112-2119 (1979), which is incorporated
herein by reference). These observations are consistent
with reports that only leukosialin strongly reacted with
T305 in Western blots of leukocyte cell extracts, even
though leukocytes also express other glycoproteins, such as
CD45, that must also contain the same hexasaccharides
(Piller et al., supra, (1991); Saitoh et al., supra,
(1991)).



EXAMPLE II 



ESTABLISHMENT OF CHO CELL LINES THAT STABLY EXPRESS
POLYOMA VIRUS LARGE T ANTIGEN AND LEUKOSIALIN

T305 preferentially binds to branched
hexasaccharides attached to leukosialin. Such
hexasaccharides are not present on the erythropoietin
glycoprotein produced in CHO cells, although the
glycoprotein does contain the precursor tetrasaccharide
(Sasaki et al., J. Biol. Chem. 262:12059-12076 (1987),
which is incorporated herein by reference). T305 antigen
also is not detectable in CHO cells transiently transfected
with pcDSRα-leu. In order to screen for the presence of a
cDNA clone expressing core 2 β1→6
N-acetylglucosaminyltransferase activity, a CHO cell line
expressing both leukosialin and polyoma large T antigen was
established (see, for example, Heffernan and Dennis, Nucl.
Acids Res. 19:85-92 (1991), which is incorporated herein by
reference).

Vectors: A plasmid vector, pPSVEl-PyE, which contains
the polyoma virus early genes under the control of the SV40
early promoter, was constructed using a modification of the
method of Muller et al., Mol. Cell. Biol. 4:2406-2412
(1984), which is incorporated herein by reference. Plasmid
pPSVEl was prepared using pPSG4 (American Type Culture
Collection 37337) and SV40 viral DNA (Bethesda Research
Laboratories) essentially as described by Featherstone et
al., Nucl. Acids Res. 12:7235-7249 (1984), which is
incorporated herein by reference. Following EcoRI and
Hindi digestion of plasmid pPyLT-1 (American Type Culture
Collection 41043), a DNA sequence containing the carboxy
terminal coding region of polyoma virus large T antigen was
isolated. The Hindi site was converted to an EcoRI site
by blunt-end ligation of phosphorylated EcoRI linkers
(Stratagene). Plasmid pPSVEl-PyE was generated by
inserting the carboxy-terminal coding sequence for large T
antigen into the unique EcoRI site of plasmid pPSVEl.

Plasmid pZIPNEO-leu was constructed by
introducing the EcoRI fragment of PEER-3 cDNA, which
contains the complete coding sequence for human
leukosialin, into the unique EcoRI site of plasmid pZIPNEO
(Cepko et al., Cell 37:1053-1063 (1984), which is
incorporated herein by reference). Plasmid structures were
confirmed by restriction mapping and by sequencing the
construction sites. pZIPNEO was kindly provided by Dr.
Channing Der.



Transfection: CHO DG44 cells were grown in 100 mm tissue
culture plates. When the cells were 20% confluent, they
were co-transfected with a 1:4 molar ratio of pZIPNEO-leu
and pPSVEl-PyE using the calcium phosphate technique
(Graham and van der Eb, Virology 52:456-467 (1973), which
is incorporated herein by reference). Transfected cells
were isolated and maintained in medium containing 400 µg/ml
G-418 (active drug).

Leukosialin expression: The total pool of G418-resistant
transfectants was enriched for human leukosialin
expressing cells by a one-step panning procedure using
anti-leukosialin antibodies and goat anti-rabbit IgG coated
panning dishes (Sigma) (Carlsson and Fukuda, J. Biol. Chem.
261:12779-12786 (1986), which is incorporated herein by
reference). Clonal cell lines were obtained by limiting
dilution. Six clonal cell lines expressing human
leukosialin on the cell surface were identified by indirect
immunofluorescence and isolated for further studies
(Williams and Fukuda, J. Cell Biol. 111:955-966 (1990),
which is incorporated herein by reference).

Polyoma virus-mediated replication: The ability of the
six clonal cell lines to support polyoma virus large T
antigen-mediated replication of plasmids was assessed by
determining the methylation status of transfected plasmids
containing a polyoma virus origin of replication (Muller et
al., supra, 1984; Heffernan and Dennis, supra, 1991).
Plasmid pGT/hCG contains a fused β1→4 galactosyltransferase
and human chorionic gonadotropin α-chain DNA sequence
inserted in plasmid pcDNAI, which contains a polyoma virus
replication origin (Aoki et al., Proc. Natl. Acad. Sci.
USA 89:4319-4323 (1992), which is incorporated herein by
reference).

Plasmid pGT/hCG was isolated from methylase-positive
E. coli strain MC1061/P3 (Invitrogen), which
methylates the adenine residues in the DpnI recognition
site, "GATC". The methylated DpnI recognition site is
susceptible to cleavage by DpnI. In contrast, the DpnI
recognition site of plasmids replicated in mammalian cells
is not methylated and, therefore, is resistant to DpnI
digestion.
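
The rule underlying this replication assay can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the sequence `toy` and the helper name `dpni_fragments` are hypothetical, the molecule is treated as linear, and methylation is modeled as all-or-none. The sketch encodes the statement above that DpnI cleaves GATC only when the adenine is methylated.

```python
import re

DPNI_SITE = "GATC"  # DpnI recognition site; cleavage requires methylated adenine


def dpni_fragments(seq: str, adenine_methylated: bool) -> list[str]:
    """Return the fragments of a linear DNA after a simulated DpnI digest.

    Assumptions for illustration only: methylation is all-or-none across
    the molecule, and DpnI cuts bluntly between the A and the T of GATC.
    """
    if not adenine_methylated:
        # Plasmid replicated in mammalian cells: sites are unmethylated,
        # so the molecule is resistant and stays intact.
        return [seq]
    cut_points = [m.start() + 2 for m in re.finditer(DPNI_SITE, seq)]
    bounds = [0] + cut_points + [len(seq)]
    return [seq[a:b] for a, b in zip(bounds, bounds[1:])]


toy = "AAGATCCCTTGATCGG"  # hypothetical sequence containing two GATC sites
input_plasmid = dpni_fragments(toy, adenine_methylated=True)
replicated_plasmid = dpni_fragments(toy, adenine_methylated=False)
```

In this toy model the input (bacterially methylated) DNA is cut into three pieces, while the replicated DNA survives as a single molecule, which is the difference the Southern blot in this example is designed to reveal.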

Methylated plasmid pGT/hCG was transfected by
lipofection into each of the six selected clonal cell lines
expressing leukosialin. After 64 hr, low molecular weight
plasmid DNA was isolated from the cells using the method of
Hirt, J. Mol. Biol. 26:365-369 (1967), which is
incorporated herein by reference. Isolated plasmid DNA was
digested with XhoI and DpnI (Stratagene), subjected to
electrophoresis in a 1% agarose gel, and transferred to
nylon membranes (Micron Separations Inc., MA).

A 0.4 kb SmaI fragment of the β1→4
galactosyltransferase DNA sequence of pGT/hCG was
radiolabeled with [32P]dCTP using the random primer method
(Feinberg and Vogelstein, Anal. Biochem. 132:6-13 (1983),
which is incorporated herein by reference). Hybridization
was performed using methods well-known to those skilled in
the art (see, for example, Sambrook et al., supra, (1989)).
Following hybridization, the membranes were washed several
times, including a final high stringency wash in 0.1 x
SSPE, 0.1% SDS for 1 hr at 65°C, then exposed to Kodak X-AR
film at -70°C.

Four of the six clones tested supported
replication of the pcDNAI-based plasmid, pGT/hCG (Fig.
3A, lanes 1, 3, 4 and 5). MOP-8 cells, a 3T3 cell line
transformed by polyoma virus early genes (Muller et al.,
supra, (1984)), express endogenous core 2 β1→6
N-acetylglucosaminyltransferase activity and were used as a
control for the replication assay (Fig. 3B, lane 1). One
clonal cell line, CHO-Py-leu (Fig. 3A, lane 5; Fig. 3B,
lanes 2 and 3), which supported pGT/hCG replication and
expressed a significant amount of leukosialin, was selected
for further studies. pGT/hCG was kindly provided by Dr.
Michiko Fukuda.



EXAMPLE III 



ISOLATION OF A cDNA SEQUENCE DIRECTING EXPRESSION OF THE
HEXASACCHARIDE ON LEUKOSIALIN

Poly(A)+ RNA was isolated from HL-60
promyelocytes, which contain a significant amount of the
core 2 β1→6 N-acetylglucosaminyltransferase (Saitoh et al.,
supra, (1991)). A cDNA expression library, pcDNAI-HL-60,
was prepared (Invitrogen) and the library was screened for
clones directing the expression of the T305 antigen.

Plasmid DNA from the pcDNAI-HL-60 cDNA library
was transfected into CHO-Py-leu cells using a modification
of the lipofection procedure, described below (Felgner et
al., Proc. Natl. Acad. Sci. USA 84:7413-7417 (1987), which
is incorporated herein by reference). CHO-Py-leu cells
were grown in 100 mm tissue culture plates. When the cells
were 20% confluent, they were washed twice with Opti-MEM I
(GIBCO). Fifty µg of lipofectin reagent (Bethesda Research
Laboratories) and 20 µg of purified plasmid DNA were each
diluted to 1.5 ml with Opti-MEM I, then mixed and added to
the cells. After incubation for 6 hr at 37°C, the medium
was removed, 10 ml of complete medium was added and
incubation was continued for 16 hr at 37°C. The medium was
then replaced with 10 ml of fresh medium.
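
As a quick check on the lipofection mixture described above, the following sketch works out the final concentrations after the two 1.5 ml dilutions are combined. It assumes the reagent and DNA masses are read as micrograms (50 and 20) and that the two volumes are simply additive; the variable names are illustrative only.

```python
# Amounts taken from the procedure above (masses read as micrograms).
lipofectin_ug = 50.0
dna_ug = 20.0
dilution_ml = 1.5  # each component is diluted separately to this volume

# Assumption: volumes are additive when the two dilutions are mixed.
total_ml = 2 * dilution_ml

lipofectin_final_ug_per_ml = lipofectin_ug / total_ml
dna_final_ug_per_ml = dna_ug / total_ml
mass_ratio = lipofectin_ug / dna_ug  # lipofectin : DNA, by mass

print(f"mix volume: {total_ml} ml")
print(f"lipofectin: {lipofectin_final_ug_per_ml:.2f} ug/ml")
print(f"DNA:        {dna_final_ug_per_ml:.2f} ug/ml")
print(f"mass ratio: {mass_ratio}:1")
```

Under these assumptions each 100 mm plate receives 3 ml of mixture with a lipofectin to DNA mass ratio of 2.5 to 1.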

Following a 64 hr period to allow transient
expression of the transfected plasmids, the cells were
detached in PBS/5 mM EDTA, pH 7.4, for 30 min at 37°C,
pooled, centrifuged and resuspended in cold PBS/10 mM
EDTA/5% fetal calf serum, pH 7.4, containing a 1:200
dilution of ascites fluid containing T305 monoclonal
antibody. The cells were incubated on ice for 1 hr, then
washed in the same buffer and panned on dishes coated with
goat anti-mouse IgG (Sigma) (Wysocki and Sato, Proc. Natl.
Acad. Sci. USA 75:2844-2848 (1978); Seed & Aruffo, Proc.
Natl. Acad. Sci. USA 84:3365-3369 (1987), which are
incorporated herein by reference). T305 monoclonal
antibody was kindly provided by Dr. R.I. Fox, Scripps
Research Foundation, La Jolla, CA.



Plasmid DNA was recovered from adherent cells by
the method of Hirt, supra, (1967), treated with DpnI to
eliminate plasmids that had not replicated in transfected
cells, and transformed into E. coli strain MC1061/P3.
Plasmid DNA was then recovered and subjected to a second
round of screening. E. coli transformants containing
plasmids recovered from this second enrichment were plated
to yield 8 pools of approximately 500 colonies each.
Replica plates were prepared using methods well-known to
those skilled in the art (see, for example, Sambrook et
al., supra, (1989)).

The pooled plasmid DNA was prepared from replica
plates and transfected into CHO-Py-leu cells. The
transfectants were screened by panning. One plasmid pool
was selected and subjected to three subsequent rounds of
selection. One plasmid, pcDNAI-C2GnT, which directed the
expression of the T305 antigen, was isolated. CHO-Py-leu
cells transfected with pcDNAI-C2GnT express the antigen
recognized by T305, whereas CHO-Py-leu cells transfected
with pcDNAI are negative for T305 antigen (Fig. 4). These
results show pcDNAI-C2GnT directs the expression of a new
determinant on leukosialin that is recognized by T305
monoclonal antibody. This determinant is the branched
hexasaccharide sequence,

NeuNAcα2→3Galβ1→3(NeuNAcα2→3Galβ1→4GlcNAcβ1→6)GalNAc.

EXAMPLE IV

CHARACTERIZATION OF C2GnT 

DNA sequence: The cDNA insert in plasmid pcDNAI-C2GnT
was sequenced by the dideoxy chain termination method using
Sequenase version 2 reagents (United States Biochemicals)
(Sanger et al., Proc. Natl. Acad. Sci. USA 74:5463-5467
(1977), which is incorporated herein by reference). Both
strands were sequenced using 17-mer synthetic
oligonucleotides, which were synthesized as the sequence of
the cDNA insert became known.

Plasmid pcDNAI-C2GnT contains a 2105 base pair
insert (Fig. 5). The cDNA sequence ends 1878 bp downstream
of the putative translation start site. A polyadenylation
signal is present at nucleotides 1694-1699. The
significance of the large number of nucleotides between the
polyadenylation signal and the beginning of the polyadenyl
chain is not clear. However, this sequence is A/T rich.

Deduced amino acid sequence: The cDNA insert in
plasmid pcDNAI-C2GnT encodes a single open reading frame in
the sense orientation with respect to the pcDNAI promoter
(Fig. 5). The open reading frame encodes a putative 428
amino acid protein having a molecular mass of 49,790
daltons.
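
The figures quoted for the insert and the open reading frame can be cross-checked with simple arithmetic. The sketch below is an illustration only: it assumes a standard three-nucleotide stop codon, reads the stated mass as 49,790 daltons, and treats "1878 bp downstream of the start site" as counted from the first nucleotide of the start codon, so the untranslated-region sizes are approximate.

```python
# Values stated in the text above.
insert_bp = 2105            # total cDNA insert
end_after_start_bp = 1878   # cDNA ends this far downstream of the start site
protein_aa = 428
protein_mass_da = 49_790

# Coding nucleotides: one codon per residue plus an assumed stop codon.
orf_nt = protein_aa * 3 + 3

# Approximate flanking regions (counting convention is an assumption).
upstream_of_start_bp = insert_bp - end_after_start_bp
downstream_of_orf_bp = end_after_start_bp - orf_nt

# Mean residue mass implied by the deduced protein.
mean_residue_mass_da = protein_mass_da / protein_aa

print(orf_nt, upstream_of_start_bp, downstream_of_orf_bp)
print(round(mean_residue_mass_da, 1))
```

The implied mean residue mass, a little over 116 daltons, falls in the range typical of proteins, so the stated length and mass are mutually consistent.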

Hydropathy analysis indicates the predicted
protein is a type II transmembrane molecule, as are all
previously reported mammalian glycosyltransferases
(Schachter, supra, (1991)). In this topology, a nine amino
acid cytoplasmic NH2-terminal segment is followed by a 23
amino acid transmembrane domain flanked by basic amino acid
residues. The large COOH-terminus consists of the stem and
catalytic domains and presumably faces the lumen of the
Golgi complex.

The putative protein contains three potential
N-glycosylation sites (Fig. 5, asterisks). However, one of
these sites contains a proline residue adjacent to
asparagine and is not likely utilized in vivo.

No matches were obtained when the C2GnT cDNA
sequence and deduced amino acid sequence were compared with
sequences listed in the PC/Gene 6.6 data bank. In
particular, no homology was revealed between the deduced
amino acid sequence of C2GnT and other
glycosyltransferases, including
N-acetylglucosaminyltransferase I (Sarkar et al., Proc. Natl.
Acad. Sci. USA 88:234-238 (1991), which is incorporated
herein by reference).

mRNA expression: Poly(A)+ RNA was prepared using a kit
(Stratagene), resolved by electrophoresis on a 1.2%
agarose/2.2 M formaldehyde gel, and transferred to nylon
membranes (Micron Separations Inc., MA) using methods
well-known to those skilled in the art (see, for example,
Sambrook et al., supra, (1989)). Membranes were probed
using the EcoRI insert of pPROTA-C2GnT (see below)
radiolabeled with [32P]dCTP by the random priming method
(Feinberg and Vogelstein, supra, (1983)). Hybridization was
performed in buffers containing 50% formamide for 24 hr at
42°C (Sambrook et al., supra, (1989)). Following
hybridization, filters were washed several times in
1xSSPE/0.1% SDS at room temperature and once in
0.1xSSPE/0.1% SDS at 42°C, then exposed to Kodak X-AR film
at -70°C.

Fig. 6 compares the level of core 2 β1→6
N-acetylglucosaminyltransferase mRNA isolated from HL-60
promyelocytes, K562 erythroleukemia cells, and poorly
metastatic SP and highly metastatic L4 colonic carcinoma
cells. The major RNA species migrates at a size
essentially identical to the ~2.1 kb C2GnT cDNA sequence.
The same result is observed for HL-60 cells and the two
colonic cell lines, which apparently synthesize the
hexasaccharides. In addition, two transcripts of ~3.3 kb
and 5.4 kb in size were detected in these cell lines. The
two larger transcripts may result from differential usage
of polyadenylation signals.

No hybridization occurred with poly(A)+ RNA
isolated from K562 cells, which lack the hexasaccharide
but synthesize the tetrasaccharide (Carlsson et al., supra,
(1986), which is incorporated herein by reference).
Similarly, no hybridization was observed for poly(A)+ RNA
isolated from CHO-Py-leu cells (Fig. 6, lane 1).



EXAMPLE V

EXPRESSION OF ENZYMATICALLY ACTIVE β1→6
N-ACETYLGLUCOSAMINYLTRANSFERASE

In order to confirm that the C2GnT cDNA encodes
core 2 β1→6 N-acetylglucosaminyltransferase, enzymatic
activity was examined in CHO-Py-leu cells transfected with
pcDNAI or pcDNAI-C2GnT. Following a 64 hr period to allow
transient expression, cell lysates were prepared and core
2 β1→6 N-acetylglucosaminyltransferase activity was
measured.



N-acetylglucosaminyltransferase assays were
performed essentially as described by Saitoh et al., supra,
(1991), Yousefi et al., supra, (1991), and Lee et al., J.
Biol. Chem. 265:20476-20487 (1990), which are incorporated
herein by reference. Each reaction contained 50 mM MES,
pH 7.0, 0.5 µCi of UDP-[3H]GlcNAc in 1 mM UDP-GlcNAc, 0.1 M
GlcNAc, 10 mM Na2EDTA, 1 mM of acceptor and 25 µl of either
cell lysate, cell supernatant or IgG-Sepharose matrix in a
total reaction volume of 50 µl.



Reactions were incubated for 1 hr at 37°C, then
processed by C18 Sep-Pak chromatography (Waters) (Palcic et
al., J. Biol. Chem. 265:6759-6769 (1990), which is
incorporated herein by reference). Core 2 and core 4 β1→6
N-acetylglucosaminyltransferase were assayed using the
acceptors p-nitrophenyl Galβ1→3GalNAc and p-nitrophenyl
GlcNAcβ1→3GalNAc, respectively (Toronto Research
Chemicals).



UDP-GlcNAc:α-Man β1→6 N-acetylglucosaminyltransferase (V)
was assayed using the acceptor
GlcNAcβ1→2Manα1→6Glc-β-O-(CH2)7CH3. The blood group
I enzyme, UDP-GlcNAc:GlcNAcβ1→3Galβ1→4GlcNAc (GlcNAc to
Gal) β1→6 N-acetylglucosaminyltransferase, was assayed
using GlcNAcβ1→3Galβ1→4GlcNAcβ1→6Manα1→6Manβ1→O-(CH2)8COOCH3
or Galβ1→4GlcNAcβ1→3Galβ1→4GlcNAcβ1→3Galβ1→4GlcNAcβ1→O-(CH2)7CH3
as acceptors (Gu et al., J. Biol. Chem. 267:2994-2999
(1992), which is incorporated herein by reference).
Synthetic acceptors were kindly provided by Dr. Ole
Hindsgaul, University of Alberta, Canada.



Results of these assays are shown in Table I.
Assuming transfection efficiency of the cells is
approximately 20-30%, the level of enzymatic activity
directed by cells transfected with pcDNAI-C2GnT is roughly
equivalent to the level observed in HL-60 cells.

TABLE I

Core 2 β1→6 N-acetylglucosaminyltransferase activity in
CHO-Py-leu cell extracts transfected with pcDNAI or
pcDNAI-C2GnT.

Vector            Core 2 β1→6 GlcNAc transferase
                  activity (pmol/mg of protein/hr)

pcDNAI            n.d.
pcDNAI-C2GnT      764

CHO-Py-leu cells were transfected with pcDNAI or
pcDNAI-C2GnT, as described in the specification. Endogenous
activity was measured in the absence of acceptor and
subtracted from values determined in the presence of added
acceptor. Galβ1→3GalNAcα-p-nitrophenyl was used as an
acceptor. n.d. = not detectable. For comparison, the core
2 β1→6 N-acetylglucosaminyltransferase activity measured in
HL-60 cells under identical conditions was 3228 pmol/mg of
protein per hr.
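
The comparison with HL-60 cells rests on a simple normalization, which can be made explicit as follows. This sketch assumes that the measured activity scales linearly with the fraction of transfected cells and that untransfected cells contribute none (consistent with the "n.d." control); the variable names are illustrative.

```python
measured_activity = 764.0   # pmol/mg protein/hr, pcDNAI-C2GnT transfectants
hl60_activity = 3228.0      # pmol/mg protein/hr, HL-60 cells
efficiencies = (0.20, 0.30)  # assumed transfection efficiency range

# Activity attributable to the transfected subpopulation alone, assuming
# untransfected cells contribute no detectable activity.
per_transfected = {eff: measured_activity / eff for eff in efficiencies}

low = min(per_transfected.values())
high = max(per_transfected.values())
hl60_within_range = low <= hl60_activity <= high

for eff, value in per_transfected.items():
    print(f"efficiency {eff:.0%}: {value:.0f} pmol/mg/hr")
print("HL-60 value within estimated range:", hl60_within_range)
```

Under these assumptions the activity per transfected cell population works out to roughly 2,500 to 3,800 pmol/mg of protein per hr, a range that brackets the HL-60 value of 3228.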

In order to unequivocally establish that the C2GnT
cDNA sequence encodes core 2 β1→6
N-acetylglucosaminyltransferase, plasmid pPROTA-C2GnT was
constructed containing the DNA sequence encoding the
putative catalytic domain of core 2 β1→6
N-acetylglucosaminyltransferase fused in frame with the
signal peptide and IgG binding domain of S. aureus protein
A (Fig. 7). The putative catalytic domain is contained in
a 1330 bp fragment of the C2GnT cDNA that encodes amino
acid residues 38 to 428. Plasmid pPROTA was kindly
provided by Dr. John B. Lowe.

The polymerase chain reaction (PCR) was used to
insert EcoRI recognition sites on either side of the 1330
bp sequence in pcDNAI-C2GnT DNA. PCR was performed using
the synthetic oligonucleotide primers
5'-TTTGAATTCCCCTGAATTTGTAAGTGTCAGACAC-3' (SEQ. ID. NO. 6) and
5'-TTTGAATTCGCAGAAACCATGCAGCTTCTCTGA-3' (SEQ. ID. NO. 7)
(EcoRI recognition sites underlined). The EcoRI sites
allowed direct, in-frame insertion of the fragment into the
unique EcoRI site of plasmid pPROTA (Sanchez-Lopez et al.,
J. Biol. Chem. 263:11892-11899 (1988), which is
incorporated herein by reference).
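
Because the underlining of the EcoRI sites does not survive in plain text, the sites can be located programmatically. The sketch below is illustrative: the primer sequences are those given above, GAATTC is the EcoRI recognition sequence, and the helper name `find_sites` is hypothetical.

```python
ECORI_SITE = "GAATTC"  # EcoRI recognition sequence

# Primer sequences as given in the text (5' to 3').
primers = {
    "SEQ ID NO 6": "TTTGAATTCCCCTGAATTTGTAAGTGTCAGACAC",
    "SEQ ID NO 7": "TTTGAATTCGCAGAAACCATGCAGCTTCTCTGA",
}


def find_sites(seq: str, site: str = ECORI_SITE) -> list[int]:
    """Return 0-based start positions of every occurrence of `site` in `seq`."""
    positions = []
    start = seq.find(site)
    while start != -1:
        positions.append(start)
        start = seq.find(site, start + 1)
    return positions


site_positions = {name: find_sites(seq) for name, seq in primers.items()}

for name, positions in site_positions.items():
    print(name, positions)
```

Each primer carries a single EcoRI site starting at the fourth nucleotide, immediately after the three leading T residues.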

The nucleotide sequence of the insert as well as
the proper orientation were confirmed by DNA sequencing
using the primers described above for cDNA sequencing.
Plasmid pPROTA-C2GnT allows secretion of the fusion protein
from transfected cells and binding of the secreted fusion
protein by insolubilized immunoglobulins.

Either pPROTA or pPROTA-C2GnT was transfected
into COS-1 cells. Following a 64 hr period to allow
transient expression, cell supernatants were collected
(Kukowska-Latallo et al., supra, (1990)). Cell
supernatants were cleared by centrifugation, adjusted to
0.05% Tween 20 and either assayed directly for core 2 β1→6
N-acetylglucosaminyltransferase activity or used in
IgG-Sepharose (Pharmacia) binding studies. For the latter
assay, supernatants (10 ml) were incubated batchwise with
approximately 300 µl of IgG-Sepharose for 4 hr at 4°C. The
matrices were then extensively washed and used directly for
glycosyltransferase assays.

No core 2 β1→6 N-acetylglucosaminyltransferase
activity was detected in the medium of COS-1 cells
transfected with the control plasmid, pPROTA. Similarly,
no enzymatic activity was associated with IgG-Sepharose
beads. In contrast, a significant level of core 2 β1→6
N-acetylglucosaminyltransferase activity was detected in the
medium of COS-1 cells transfected with pPROTA-C2GnT. The
activity also associated with the IgG-Sepharose beads
(Table II). No activity was detected in the supernatant
following incubation of the supernatant with IgG-Sepharose.

TABLE II

Determination of Enzymatic Activities Directed by
pPROTA-C2GnT.

Acceptors and                            Radioactivity (cpm) without (-)
linkages formed                          and with (+) acceptor
                                            (-)        (+)

Galβ1→3(GlcNAcβ1→6)GalNAc                   109       1048
(core 2-GnT)

GlcNAcβ1→3(GlcNAcβ1→6)GalNAc                111        113
(core 4-GnT)

GlcNAcβ1→2(GlcNAcβ1→6)Man                   118        115
(GnTV)

GlcNAcβ1→3(GlcNAcβ1→6)Gal                   111        113
(I-GnT)

Galβ1→4GlcNAcβ1→3(GlcNAcβ1→6)Gal             99         96
(I-GnT)

COS-1 cells were transfected with pPROTA-C2GnT and the
conditioned media were incubated with IgG-Sepharose. The
proteins bound to the IgG-Sepharose were assayed for β1→6
N-acetylglucosaminyltransferase activity by using
appropriate acceptors. The linkages formed are the
GlcNAcβ1→6 branches shown in parentheses. Similar results
were obtained in three independent experiments.

EXAMPLE VI

DETERMINATION OF C2GnT SPECIFICITY

Four types of β1→6
N-acetylglucosaminyltransferase linkages have been reported,
including core 2 and core 4 in O-glycans, I-antigen and a
branch attached to mannose that forms tetraantennary
N-glycans (see Table II). In order to determine whether
these different structures are also synthesized by the
enzyme encoded by the cloned C2GnT cDNA sequence, enzymatic
activity was determined using five different acceptors.



As shown in Table II, the fusion protein was only
active with the acceptor for core 2 formation. The same
was true when the formation of β1→6 N-acetylglucosaminyl
linkage to internal galactose residues was examined (Table
II, see structure at bottom). This result precludes the
likelihood that the enzyme encoded by the C2GnT cDNA
sequence may add N-acetylglucosamine to a non-reducing
terminal galactose. The HL-60 core 2 β1→6
N-acetylglucosaminyltransferase is exclusively responsible
for the formation of the GlcNAcβ1→6 branch on
Galβ1→3GalNAc.
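
The specificity conclusion drawn from Table II can be restated as a signal-over-background calculation. The sketch below uses the cpm values from Table II, ordered as (without acceptor, with acceptor); the two-fold cutoff for calling an acceptor "active" is an arbitrary threshold chosen for illustration and is not taken from the specification.

```python
# cpm from Table II as (without acceptor, with acceptor).
table_ii_cpm = {
    "core 2-GnT": (109, 1048),
    "core 4-GnT": (111, 113),
    "GnTV": (118, 115),
    "I-GnT (terminal GlcNAc acceptor)": (111, 113),
    "I-GnT (internal Gal acceptor)": (99, 96),
}

# Fold incorporation over the no-acceptor background.
fold = {name: plus / minus for name, (minus, plus) in table_ii_cpm.items()}

FOLD_CUTOFF = 2.0  # hypothetical threshold, for illustration only
active = [name for name, ratio in fold.items() if ratio >= FOLD_CUTOFF]

for name, ratio in fold.items():
    print(f"{name}: {ratio:.2f}-fold over background")
print("active with:", active)
```

Only the core 2 acceptor gives incorporation well above background (close to ten-fold); the other four acceptors are indistinguishable from the no-acceptor control.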



Although the invention has been described with 
reference to the disclosed embodiments, it should be 
understood that various modifications can be made without 
departing from the spirit of the invention. Accordingly, 
the invention is limited only by the following claims.

SEQUENCE LISTING



(1) GENERAL INFORMATION:

(i) APPLICANT: LA JOLLA CANCER RESEARCH FOUNDATION 



(ii) TITLE OF INVENTION: A NOVEL BETA1-6
N-ACETYLGLUCOSAMINYLTRANSFERASE, ITS ACCEPTOR MOLECULE,
LEUKOSIALIN AND A METHOD FOR CLONING PROTEINS HAVING
ENZYMATIC ACTIVITY

(iii) NUMBER OF SEQUENCES: 8

(iv) CORRESPONDENCE ADDRESS : 

(A) ADDRESSEE: CAMPBELL AND FLORES 

(B) STREET: 4370 LA JOLLA VILLAGE DRIVE, SUITE 700

(C) CITY: SAN DIEGO 

(D) STATE: CALIFORNIA 

(E) COUNTRY: USA 

(F) ZIP: 92122 

(v) COMPUTER READABLE FORM:

(A) MEDIUM TYPE: Floppy disk

(B) COMPUTER: IBM PC compatible

(C) OPERATING SYSTEM: PC-DOS/MS-DOS

(D) SOFTWARE: PatentIn Release #1.0, Version #1.25

(vi) CURRENT APPLICATION DATA: 

(A) APPLICATION NUMBER: 

(B) FILING DATE: 30 September 1993 

(C) CLASSIFICATION: 

(viii) ATTORNEY/AGENT INFORMATION:

(A) NAME: KONSKI, ANTOINETTE F.

(B) REGISTRATION NUMBER: 34,202

(C) REFERENCE/DOCKET NUMBER: FP-LJ 9756

(ix) TELECOMMUNICATION INFORMATION: 

(A) TELEPHONE: 619-535-9001 

(B) TELEFAX: 619-535-8949 



(2) INFORMATION FOR SEQ ID NO: 1:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 900 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: both

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic) 



(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 841..900

(ix) FEATURE:

(A) NAME/KEY: exon

(B) LOCATION: 91..192

(D) OTHER INFORMATION: /note= "EXON 1' IS LOCATED IN BOTH
GENOMIC AND CDNA. IN THE CDNA EXON 1' IS
IMMEDIATELY FOLLOWED BY EXON 2."

(ix) FEATURE:

(A) NAME/KEY: exon

(B) LOCATION: 359..428

(D) OTHER INFORMATION: /note= "EXON 1 IS LOCATED IN
GENOMIC DNA"

(ix) FEATURE:

(A) NAME/KEY: intron

(B) LOCATION: 193..806

(D) OTHER INFORMATION: /note= "THIS SEGMENT OF NUCLEIC
ACID CONSTITUTES INTRON SEQUENCE OF THE cDNA"

(ix) FEATURE:

(A) NAME/KEY: exon

(B) LOCATION: 807..900

(D) OTHER INFORMATION: /note= "EXON 2 IS LOCATED IN BOTH
GENOMIC AND CDNA. IN THE cDNA EXON 2 IMMEDIATELY
FOLLOWS EXON 1'."
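
The feature coordinates listed above for SEQ ID NO: 1 can be cross-checked with a short script. The following Python sketch is illustrative only and is not part of the disclosed method. It assumes 1-based inclusive coordinates, and it assumes that the two sequence fragments it uses (nucleotides 181 to 200 and 781 to 810 of SEQ ID NO: 1, with the spacing of the listing removed) have been read correctly from the sequence description. All function and variable names are hypothetical.

```python
# Feature coordinates copied from the SEQ ID NO: 1 feature table
# (assumed to be 1-based and inclusive).
FEATURES = {
    "exon 1'": (91, 192),
    "exon 1": (359, 428),
    "intron": (193, 806),
    "exon 2": (807, 900),
    "CDS": (841, 900),
}

# Fragments of SEQ ID NO: 1 around the intron boundaries (assumed transcription).
ROW_181 = "TGCCTCCCTCAGGTGAGCCC"            # nucleotides 181..200
ROW_781 = "CTGCTTCTCCGGCTGCCCACCTGCAGGTCC"  # nucleotides 781..810


def length(span):
    """Length of a 1-based inclusive interval."""
    start, end = span
    return end - start + 1


def adjacent(left, right):
    """True if `right` begins immediately after `left` ends."""
    return left[1] + 1 == right[0]


def contains(outer, inner):
    """True if `inner` lies entirely within `outer`."""
    return outer[0] <= inner[0] and inner[1] <= outer[1]


def slice_1based(row, row_start, start, end):
    """Return nucleotides start..end (1-based, inclusive) from a row
    whose first nucleotide has position `row_start`."""
    return row[start - row_start:end - row_start + 1]


lengths = {name: length(span) for name, span in FEATURES.items()}

# Exon 1' joined directly to exon 2, as described in the feature notes.
spliced_length = lengths["exon 1'"] + lengths["exon 2"]

# First two and last two nucleotides of the annotated intron.
intron_start, intron_end = FEATURES["intron"]
donor = slice_1based(ROW_181, 181, intron_start, intron_start + 1)
acceptor = slice_1based(ROW_781, 781, intron_end - 1, intron_end)

checks = {
    "exon 1' abuts intron": adjacent(FEATURES["exon 1'"], FEATURES["intron"]),
    "intron abuts exon 2": adjacent(FEATURES["intron"], FEATURES["exon 2"]),
    "exon 1 lies inside intron span": contains(FEATURES["intron"], FEATURES["exon 1"]),
    "CDS lies inside exon 2": contains(FEATURES["exon 2"], FEATURES["CDS"]),
    "CDS is a whole number of codons": lengths["CDS"] % 3 == 0,
}

for name, ok in checks.items():
    print(f"{name}: {ok}")
print("intron boundaries:", donor, "...", acceptor)
```

Under these assumptions the annotated intron begins with GT and ends with AG, and the 60-nucleotide CDS corresponds to the 20 amino acids of SEQ ID NO: 2.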

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1:

TTGGGGACCA CAAATGCAAA GGAAACCACC CTCCCCTCCC ACCTCCTCCT CTGCACCCTT 60

GAGTTCTCAG GCTCACATTC CCACCACCCA CCTCTGAGCC CAGCCCTCCC TAGCATCACC 120

ACTTCCATCC CATTCCTCAG CCAAGAGCCA GGAATCCTGA TTCCAGATCC CACGCTTCCC 180

TGCCTCCCTC AGGTGAGCCC CAGACCCCCA GGCACCCCGC TGGCCCCTGA AGGAGCAGGT 240

GATGGTGCTG TCTTCGCCCA GCAGCTGTGG GAGCAGGCGG GTGGGGCAGG ATGGAGGGGT 300

GGGTGGGGTG GGTGGAGCCA GGGCCCACTT CCTTTCCCCT TGGGGCCCTG TCCTTCCCAG 360

TCTTGCCCCA GCCTCGGGAG GTGGTGGAGT GACCTGGCCC CAGTGCTGCG TCCTTATCAG 420

CCGAGCCGGT AAGAGGGTGA GACTTGGTGG GGTAGGGGCC TCAGTGGGCC TGGGAATGTG 480

CCTGTGGCTT GAAAAGACTC TGACAGGTTA TGATGGGAAG AGATTGGGAG CCATTGGGCT 540

GCACAGGGTC AGGGAAGGCC AGGAGGGGCT GGTCACTGCT GGAATCTAAG CTGCTGAGGC 600

TGGAGGGAGC CTCAGGATGG GGCTGATGGG GGAGCTGCCA GCATCTGTTC CTCTGTCATT 660

TCTGATAACA GTAAAAGCCA GCATGGAAAA AACCGTTAAA CCGCAGGTTG GGCCTGGCCG 720

TTGGCAGGGA AGTGGGCAGA GGGGAGGCCC GGCCAGGTCC TCCGGCAACT CCCGCGTGTT 780

CTGCTTCTCC GGCTGCCCAC CTGCAGGTCC CAGCTCTTGC TCCTGCCTGT TTGCCTGGAA 840

ATG GCC ACG CTT CTC CTT CTC CTT GGG GTG CTG GTG GTA AGC CCA GAC 888
Met Ala Thr Leu Leu Leu Leu Leu Gly Val Leu Val Val Ser Pro Asp
1 5 10 15

GCT CTG GGG AGC 900
Ala Leu Gly Ser
20

(2) INFORMATION FOR SEQ ID NO: 2:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 20 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2:

Met Ala Thr Leu Leu Leu Leu Leu Gly Val Leu Val Val Ser Pro Asp
1 5 10 15

Ala Leu Gly Ser
20
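
The relationship between the coding segment of SEQ ID NO: 1 (nucleotides 841 to 900) and the 20 amino acids of SEQ ID NO: 2 can be illustrated as follows. This Python sketch is offered only as a consistency check under stated assumptions: the codon table is a deliberately partial excerpt of the standard genetic code covering only the codons that occur in this segment, and the names used are hypothetical.

```python
# Partial excerpt of the standard genetic code: only the codons that occur
# in nucleotides 841..900 of SEQ ID NO: 1 are included.
CODON_TABLE = {
    "ATG": "Met",
    "GCC": "Ala", "GCT": "Ala",
    "ACG": "Thr",
    "CTT": "Leu", "CTC": "Leu", "CTG": "Leu",
    "GGG": "Gly",
    "GTG": "Val", "GTA": "Val",
    "AGC": "Ser",
    "CCA": "Pro",
    "GAC": "Asp",
}

# Codons as printed in the sequence description of SEQ ID NO: 1.
CDS_CODONS = (
    "ATG GCC ACG CTT CTC CTT CTC CTT GGG GTG CTG GTG GTA AGC CCA GAC "
    "GCT CTG GGG AGC"
).split()

# Residues as printed for SEQ ID NO: 2.
EXPECTED = (
    "Met Ala Thr Leu Leu Leu Leu Leu Gly Val Leu Val Val Ser Pro Asp "
    "Ala Leu Gly Ser"
).split()


def translate(codons):
    """Translate a list of codons; raises KeyError for a codon that is
    outside the partial table above."""
    return [CODON_TABLE[codon] for codon in codons]


peptide = translate(CDS_CODONS)
nucleotide_count = 3 * len(CDS_CODONS)

print(" ".join(peptide))
print("matches SEQ ID NO: 2:", peptide == EXPECTED)
```

A complete codon table would be needed to apply the same check to the longer coding region of SEQ ID NO: 3.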

(2) INFORMATION FOR SEQ ID NO: 3: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2105 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: both 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA



(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 220..1504

(ix) FEATURE:

(A) NAME/KEY: polyA_signal

(B) LOCATION: 1913..1918

(ix) FEATURE:

(A) NAME/KEY: misc_signal

(B) LOCATION: 248..314

(D) OTHER INFORMATION: /standard_name= "SIGNAL/MEMBRANE-ANCHORING DOMAIN"



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3: 

GTGAAGTGCT CAGAATGGGG CAGGATGTCA CCTGGAATCA GCACTAAGTG ATTCAGACTT 60 

TCCTTACTTT TAAATGTGCT GCTCTTCATT TCAAGATGCC GTTGCAGCTC TGATAAATGC 120 

AAACTGACAA CCTTCAAGGC CACGACGGAG GGAAAATCAT TGGTGCTTGG AGCATAGAAG 180 

ACTGCCCTTC ACAAAGGAAA TCCCTGATTA TTGTTTGAA ATG CTG AGG ACG TTG 234 

Met Leu Arg Thr Leu 
1 5 

CTG CGA AGG AGA CTT TTT TCT TAT CCC ACC AAA TAC TAC TTT ATG GTT 282
Leu Arg Arg Arg Leu Phe Ser Tyr Pro Thr Lys Tyr Tyr Phe Met Val
10 15 20

CTT GTT TTA TCC CTA ATC ACC TTC TCC GTT TTA AGG ATT CAT CAA AAG 330
Leu Val Leu Ser Leu Ile Thr Phe Ser Val Leu Arg Ile His Gln Lys
25 30 35

CCT GAA TTT GTA AGT GTC AGA CAC TTG GAG CTT GCT GGG GAG AAT CCT 378
Pro Glu Phe Val Ser Val Arg His Leu Glu Leu Ala Gly Glu Asn Pro
40 45 50

AGT AGT GAT ATT AAT TGC ACC AAA GTT TTA CAG GGT GAT GTA AAT GAA 426
Ser Ser Asp Ile Asn Cys Thr Lys Val Leu Gln Gly Asp Val Asn Glu
55 60 65

ATC CAA AAG GTA AAG CTT GAG ATC CTA ACA GTG AAA TTT AAA AAG CGC 474
Ile Gln Lys Val Lys Leu Glu Ile Leu Thr Val Lys Phe Lys Lys Arg
70 75 80 85

CCT CGG TGG ACA CCT GAC GAC TAT ATA AAC ATG ACC AGT GAC TGT TCT 522
Pro Arg Trp Thr Pro Asp Asp Tyr Ile Asn Met Thr Ser Asp Cys Ser
90 95 100

TCT TTC ATC AAG AGA CGC AAA TAT ATT GTA GAA CCC CTT AGT AAA GAA 570
Ser Phe Ile Lys Arg Arg Lys Tyr Ile Val Glu Pro Leu Ser Lys Glu
105 110 115

GAG GCG GAG TTT CCA ATA GCA TAT TCT ATA GTG GTT CAT CAC AAG ATT 618
Glu Ala Glu Phe Pro Ile Ala Tyr Ser Ile Val Val His His Lys Ile
120 125 130

GAA ATG CTT GAC AGG CTG CTG AGG GCC ATC TAT ATG CCT CAG AAT TTC 666
Glu Met Leu Asp Arg Leu Leu Arg Ala Ile Tyr Met Pro Gln Asn Phe
135 140 145

TAT TGC GTT CAT GTG GAC ACA AAA TCC GAG GAT TCC TAT TTA GCT GCA 714
Tyr Cys Val His Val Asp Thr Lys Ser Glu Asp Ser Tyr Leu Ala Ala
150 155 160 165

GTG ATG CGC ATC GCT TCC TGT TTT AGT AAT GTC TTT GTG GCC AGC CGA 762
Val Met Gly Ile Ala Ser Cys Phe Ser Asn Val Phe Val Ala Ser Arg
170 175 180

TTG GAG AGT GTG GTT TAT GCA TCG TGG AGC CGG GTT CAG GCT GAC CTC 810
Leu Glu Ser Val Val Tyr Ala Ser Trp Ser Arg Val Gln Ala Asp Leu
185 190 195

AAC TGC ATG AAG GAT CTC TAT GCA ATG AGT GCA AAC TGG AAG TAG TTG 858
Asn Cys Met Lys Asp Leu Tyr Ala Met Ser Ala Asn Trp Lys Tyr Leu
200 205 210

ATA AAT CTT TGT GGT ATG GAT TTT CCC ATT AAA ACC AAC CTA GAA ATT 906
Ile Asn Leu Cys Gly Met Asp Phe Pro Ile Lys Thr Asn Leu Glu Ile
215 220 225

GTC AGG AAG CTC AAG TTG TTA ATG GGA GAA AAC AAC CTG GAA ACG GAG 954
Val Arg Lys Leu Lys Leu Leu Met Gly Glu Asn Asn Leu Glu Thr Glu
230 235 240 245

AGG ATG CCA TCC CAT AAA GAA GAA AGG TGG AAG AAG CGG TAT GAG GTC 1002
Arg Met Pro Ser His Lys Glu Glu Arg Trp Lys Lys Arg Tyr Glu Val
250 255 260

GTT AAT GGA AAG CTG ACA AAC ACA GGG ACT GTC AAA ATG CTT CCT CCA 1050 
Val Asn Gly Lys Leu Thr Asn Thr Gly Thr Val Lys Met Leu Pro Pro 
265 270 275 

CTC GAA ACA CCT CTC TTT TCT GGC AGT GCC TAC TTC GTG GTC AGT AGG 1098
Leu Glu Thr Pro Leu Phe Ser Gly Ser Ala Tyr Phe Val Val Ser Arg
280 285 290

GAG TAT GTG GGG TAT GTA CTA CAG AAT GAA AAA ATC CAA AAG TTG ATG 1146
Glu Tyr Val Gly Tyr Val Leu Gln Asn Glu Lys Ile Gln Lys Leu Met
295 300 305

GAG TGG GCA CAA GAC ACA TAC AGC CCT GAT GAG TAT CTC TGG GCC ACC 1194
Glu Trp Ala Gln Asp Thr Tyr Ser Pro Asp Glu Tyr Leu Trp Ala Thr
310 315 320 325

ATC CAA AGG ATT CCT GAA GTC CCG GGC TCA CTC CCT GCC AGC CAT AAG 1242
Ile Gln Arg Ile Pro Glu Val Pro Gly Ser Leu Pro Ala Ser His Lys
330 335 340

TAT GAT CTA TCT GAC ATG CAA GCA GTT GCC AGG TTT GTC AAG TGG CAG 1290
Tyr Asp Leu Ser Asp Met Gln Ala Val Ala Arg Phe Val Lys Trp Gln
345 350 355

TAC TTT GAG GGT GAT GTT TCC AAG GGT GCT CCC TAC CCG CCC TGC GAT 1338
Tyr Phe Glu Gly Asp Val Ser Lys Gly Ala Pro Tyr Pro Pro Cys Asp
360 365 370

GGA GTC CAT GTG CGC TCA GTG TGC ATT TTC GGA GCT GGT GAC TTG AAC 1386
Gly Val His Val Arg Ser Val Cys Ile Phe Gly Ala Gly Asp Leu Asn
375 380 385

TGG ATG CTG CGC AAA CAC CAC TTG TTT GCC AAT AAG TTT GAC GTG GAT 1434 
Trp Met Leu Arg Lys His His Leu Phe Ala Asn Lys Phe Asp Val Asp 
390 395 400 405 

GTT GAC CTC TTT GCC ATC CAG TGT TTG GAT GAG CAT TTG AGA CAC AAA 1482
Val Asp Leu Phe Ala Ile Gln Cys Leu Asp Glu His Leu Arg His Lys
410 415 420

GCT TTG GAG ACA TTA AAA CAC T GACCATTACG GGCAATTTTA TGAACAAGAA 1534
Ala Leu Glu Thr Leu Lys His
425



GAAGGATACA CAAAACGTAC CTTATCTGTT TCCCCTTCCT TGTCAGCGTC GGGAAGATGG 1594

TATGAAGTCC TCTTTGGGGC AGGGACTCTA GTAGATCTTC TTGTCAGAGA AGCTGCATGG 1654

TTTCTGCAGA GCACAGTTAG CTAGAAAGGT GATAGCATTA AATGTTCATC TAGAGTTAAT 1714

AGTGGGAGGA GTAAAGGTAG CCTTGAGGCC AGAGCAGGTA GCAAGGCATT GTGGAAAGAG 1774

GGGACCAGGG TGGCTGGGGA AGAGGCCGAT GCATAAAGTC AGCCTGTTCC AAGTGCTCAG 1834

GGACTTAGCA AAATGAGAAG ATGTGACCTG TGCCAAAACT ATTTTGAGAA TTTTAAATGT 1894

GACCATTTTT CTGGTATGAA TAAACTTACA GCAACAAATA ATCAAAGATA CAATTAATCT 1954

GATATTATAT TTGTTGAAAT AGAAATTTGA TTGTACTATA AATGATTTTT GTAAATAATT 2014

TATATTCTGC TCTAATACTG TACTGTGTAG TGTGTCTCCG TATGTCATCT CAGGGAGCTT 2074

AAAATGGGCT TGATTTAACA TTGAAAAAAA A 2105
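
The polyA_signal feature of SEQ ID NO: 3 (nucleotides 1913 to 1918) can be located in the listing with simple position arithmetic. The Python sketch below is illustrative only. It assumes that the row of the sequence description ending at position 1954 begins at nucleotide 1895, that its spacing has been read correctly, and that coordinates are 1-based and inclusive. The names used are hypothetical.

```python
# Row of SEQ ID NO: 3 that ends at position 1954 (assumed transcription,
# spacing as printed in groups of ten nucleotides).
ROW = "GACCATTTTT CTGGTATGAA TAAACTTACA GCAACAAATA ATCAAAGATA CAATTAATCT"
ROW_END = 1954

# Feature coordinates from the SEQ ID NO: 3 feature table.
POLYA_SIGNAL = (1913, 1918)

row = ROW.replace(" ", "")
row_start = ROW_END - len(row) + 1  # position of the first nucleotide in the row


def slice_1based(seq, seq_start, start, end):
    """Return nucleotides start..end (1-based, inclusive) from `seq`,
    whose first nucleotide has position `seq_start`."""
    if start < seq_start or end > seq_start + len(seq) - 1:
        raise ValueError("requested span lies outside the supplied row")
    return seq[start - seq_start:end - seq_start + 1]


signal = slice_1based(row, row_start, *POLYA_SIGNAL)
print(f"row covers {row_start}..{ROW_END}; nucleotides 1913..1918 read {signal}")
```

Under these assumptions the annotated span reads AATAAA, the hexamer commonly associated with polyadenylation signals.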



(2) INFORMATION FOR SEQ ID NO: 4: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 428 amino acids

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4: 

Met Leu Arg Thr Leu Leu Arg Arg Arg Leu Phe Ser Tyr Pro Thr Lys
1 5 10 15

Tyr Tyr Phe Met Val Leu Val Leu Ser Leu Ile Thr Phe Ser Val Leu
20 25 30

Arg Ile His Gln Lys Pro Glu Phe Val Ser Val Arg His Leu Glu Leu
35 40 45

Ala Gly Glu Asn Pro Ser Ser Asp Ile Asn Cys Thr Lys Val Leu Gln
50 55 60

Gly Asp Val Asn Glu Ile Gln Lys Val Lys Leu Glu Ile Leu Thr Val
65 70 75 80

Lys Phe Lys Lys Arg Pro Arg Trp Thr Pro Asp Asp Tyr Ile Asn Met
85 90 95

Thr Ser Asp Cys Ser Ser Phe Ile Lys Arg Arg Lys Tyr Ile Val Glu
100 105 110

Pro Leu Ser Lys Glu Glu Ala Glu Phe Pro Ile Ala Tyr Ser Ile Val
115 120 125

Val His His Lys Ile Glu Met Leu Asp Arg Leu Leu Arg Ala Ile Tyr
130 135 140

Met Pro Gln Asn Phe Tyr Cys Val His Val Asp Thr Lys Ser Glu Asp
145 150 155 160

Ser Tyr Leu Ala Ala Val Met Gly Ile Ala Ser Cys Phe Ser Asn Val
165 170 175

Phe Val Ala Ser Arg Leu Glu Ser Val Val Tyr Ala Ser Trp Ser Arg
180 185 190

Val Gln Ala Asp Leu Asn Cys Met Lys Asp Leu Tyr Ala Met Ser Ala
195 200 205

Asn Trp Lys Tyr Leu Ile Asn Leu Cys Gly Met Asp Phe Pro Ile Lys
210 215 220

Thr Asn Leu Glu Ile Val Arg Lys Leu Lys Leu Leu Met Gly Glu Asn
225 230 235 240

Asn Leu Glu Thr Glu Arg Met Pro Ser His Lys Glu Glu Arg Trp Lys
245 250 255

Lys Arg Tyr Glu Val Val Asn Gly Lys Leu Thr Asn Thr Gly Thr Val
260 265 270

Lys Met Leu Pro Pro Leu Glu Thr Pro Leu Phe Ser Gly Ser Ala Tyr
275 280 285

Phe Val Val Ser Arg Glu Tyr Val Gly Tyr Val Leu Gln Asn Glu Lys
290 295 300

Ile Gln Lys Leu Met Glu Trp Ala Gln Asp Thr Tyr Ser Pro Asp Glu
305 310 315 320

Tyr Leu Trp Ala Thr Ile Gln Arg Ile Pro Glu Val Pro Gly Ser Leu
325 330 335

Pro Ala Ser His Lys Tyr Asp Leu Ser Asp Met Gln Ala Val Ala Arg
340 345 350

Phe Val Lys Trp Gln Tyr Phe Glu Gly Asp Val Ser Lys Gly Ala Pro
355 360 365

Tyr Pro Pro Cys Asp Gly Val His Val Arg Ser Val Cys Ile Phe Gly
370 375 380

Ala Gly Asp Leu Asn Trp Met Leu Arg Lys His His Leu Phe Ala Asn
385 390 395 400

Lys Phe Asp Val Asp Val Asp Leu Phe Ala Ile Gln Cys Leu Asp Glu
405 410 415

His Leu Arg His Lys Ala Leu Glu Thr Leu Lys His
420 425

(2) INFORMATION FOR SEQ ID NO: 5: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 34 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5: 

TTTGAATTCC CCTGAATTTG TAAGTGTCAG ACAC 34

(2) INFORMATION FOR SEQ ID NO: 6: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 33 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6:

TTTGAATTCG CAGAAACCAT GCAGCTTCTC TGA 33

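
The relationship between the oligonucleotides of SEQ ID NOS: 5 and 6 and the cDNA of SEQ ID NO: 3 can be examined with the following Python sketch. It is offered as an illustrative check only, not as a description of how the oligonucleotides were designed. It assumes that the two cDNA fragments shown (nucleotides 331 to 354 and 1635 to 1664 of SEQ ID NO: 3) have been read correctly from the sequence description. GAATTC is the recognition sequence of the restriction enzyme EcoRI. All names are hypothetical.

```python
# Oligonucleotides as listed (spacing removed).
PRIMER_5 = "TTTGAATTCC CCTGAATTTG TAAGTGTCAG ACAC".replace(" ", "")  # SEQ ID NO: 5
PRIMER_6 = "TTTGAATTCG CAGAAACCAT GCAGCTTCTC TGA".replace(" ", "")   # SEQ ID NO: 6

# Fragments of SEQ ID NO: 3 (assumed transcription of the listing).
CDNA_331 = "CCT GAA TTT GTA AGT GTC AGA CAC".replace(" ", "")         # nucleotides 331..354
CDNA_1635 = "TTGTCAGAGA AGCTGCATGG TTTCTGCAGA".replace(" ", "")       # nucleotides 1635..1664

ECORI_SITE = "GAATTC"
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Reverse complement of an upper-case DNA string."""
    return dna.translate(COMPLEMENT)[::-1]


# 0-based offset of the EcoRI recognition sequence in each oligonucleotide.
site_offsets = {
    "SEQ ID NO: 5": PRIMER_5.find(ECORI_SITE),
    "SEQ ID NO: 6": PRIMER_6.find(ECORI_SITE),
}

# Sense oligonucleotide: the 24 nucleotides after the first ten should be
# identical to cDNA nucleotides 331..354.
forward_tail = PRIMER_5[10:]
forward_matches = forward_tail == CDNA_331

# Antisense oligonucleotide: the reverse complement of the 24 nucleotides
# after the first nine should occur in the 3' fragment.
reverse_tail = reverse_complement(PRIMER_6[9:])
offset = CDNA_1635.find(reverse_tail)
reverse_start = 1635 + offset if offset >= 0 else None

print("EcoRI site offsets:", site_offsets)
print("SEQ ID NO: 5 matches 331..354:", forward_matches)
print("SEQ ID NO: 6 anneals starting at:", reverse_start)
```

Under these assumptions the pair would bracket nucleotides 331 to 1661 of SEQ ID NO: 3, a span that begins at the codon for Pro-38 and extends beyond the stop codon.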
(2) INFORMATION FOR SEQ ID NO: 7: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 15 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein

(v) FRAGMENT TYPE: internal



(ix) FEATURE: 

(A) NAME/KEY: CDS

(B) LOCATION: 1..15

(D) OTHER INFORMATION: /note= "PROTEIN A - C2GNT FUSION
PROTEIN"



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7:

GGG AAT TCC CCT GAA 15
Gly Asn Ser Pro Glu
1 5

(2) INFORMATION FOR SEQ ID NO: 8:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 5 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8: 

Gly Asn Ser Pro Glu 
1 5

CLAIMS

We claim: 

1. A purified human protein or an active
fragment thereof having β1→6
N-acetylglucosaminyltransferase activity.

2. The purified protein of claim 1, wherein 
said activity is that of UDP-GlcNAc:Galβ1→3GalNAc (GlcNAc
to GalNAc) β1→6 N-acetylglucosaminyltransferase.

3. The purified protein of claim 2, wherein 
said protein has a relative molecular weight of about 50
kD.

4. An isolated nucleic acid encoding the human 
protein or active fragment thereof of claim 1. 

5. A vector containing the nucleic acid of
claim 4.

6. The vector of claim 5, wherein said vector 
is a plasmid. 

7. The vector of claim 5, wherein said vector 
is pcDNAI-C2GnT. 

8. A host cell containing the vector of claim
5.

9. A purified human protein or a fragment
thereof that is an acceptor molecule, said acceptor
molecule being acted upon by the protein of claim 2 having
activity which exclusively forms core 2 oligosaccharide
structures in O-glycans.

10. The acceptor molecule of claim 9, wherein
said acceptor molecule is leukosialin, CD43. 

11. An isolated nucleic acid encoding the 
acceptor molecule of claim 9. 

12. A vector containing the nucleic acid of
claim 11.

13. The vector of claim 12, wherein said vector
is a plasmid. 

14. The vector of claim 12, wherein said vector 
is pcDSRα-leu.

15. A host cell containing the vector of claim
12.



16. A method of obtaining from a cell line, 
which does not normally contain a protein having catalytic 
activity or an acceptor molecule for said protein, a 
nucleic acid encoding said protein having catalytic 
activity comprising:

a. transfecting said cell line with a DNA 
sequence encoding the acceptor molecule, wherein the 
acceptor molecule is stably expressed in the cell line; 

b. transfecting said cell line with a cDNA 
library containing said nucleic acid in a vector, wherein
proteins encoded by the transfected cDNA are transiently
expressed; 

c. screening the transfected cells for 
expression of said protein having catalytic activity; and 

d. isolating the nucleic acid encoding the
protein having catalytic activity.

17. The vector of claim 16, wherein said vector
replicates in the transfected cell line. 

18. The vector in claim 17, wherein said vector 
is a plasmid. 

19. The vector of claim 16, wherein said vector 
contains a viral replication origin. 

20. The vector of claim 19, wherein said 
replication origin is the polyoma virus replication origin. 

21. The cell line of claim 16, wherein said cell
line supports replication of a vector. 

22. The cell line of claim 16, wherein said cell 
line expresses polyoma virus large T antigen. 

23. The cell line of claim 16, wherein said cell 
line is the Chinese hamster ovary cell line. 

24. The cell line of claim 23, wherein said cell 
line is CHO-Py-leu. 

25. A method of isolating a polypeptide having 
catalytic activity that forms core 2 oligosaccharide 
structures in O-glycans, said method comprising growing the 
host cell of claim 8 under conditions which favor
expression of a nucleic acid encoding said polypeptide, and
isolating said polypeptide so produced.